
Sequential Decision Problems


Adaptive Concentration Inequalities for Sequential Decision Problems

Neural Information Processing Systems

A key challenge in sequential decision problems is to determine how many samples are needed for an agent to make reliable decisions with good probabilistic guarantees. We introduce Hoeffding-like concentration inequalities that hold for a random, adaptively chosen number of samples. Our inequalities are tight under natural assumptions and can greatly simplify the analysis of common sequential decision problems. In particular, we apply them to sequential hypothesis testing, best arm identification, and sorting. The resulting algorithms rival or exceed the state of the art both theoretically and empirically.
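The idea of a concentration bound that holds at a random, data-dependent stopping time can be sketched with a simple sequential coin test: keep sampling until the running mean leaves an iterated-logarithm-style confidence band. The constants in the threshold below are hypothetical placeholders chosen only to show the shape of such a bound, not the tight constants derived in the paper.

```python
import math
import random

def adaptive_boundary(n, delta=0.05):
    """Illustrative LIL-style threshold ~ sqrt(log(log n / delta) / n).

    Placeholder constants for illustration only; the paper derives
    tighter, provably valid ones.
    """
    if n < 2:
        return 1.0
    return math.sqrt((0.6 * math.log(math.log(n) + 1) + math.log(1 / delta)) / n)

def sequential_coin_test(sample, delta=0.05, max_n=100_000):
    """Stop as soon as the running mean leaves the band around 0.5."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += sample()
        mean = total / n
        if abs(mean - 0.5) > adaptive_boundary(n, delta):
            return ("greater" if mean > 0.5 else "less"), n
    return "undecided", max_n

rng = random.Random(0)
decision, n_used = sequential_coin_test(lambda: 1.0 if rng.random() < 0.9 else 0.0)
print(decision, n_used)
```

Because the threshold shrinks roughly like sqrt(log log n / n), the test stops quickly on a heavily biased coin while controlling the chance of ever crossing the band under the null.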




Uni[MASK]: Unified Inference in Sequential Decision Problems

Neural Information Processing Systems

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the UniMASK framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. We show that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models. Additionally, after fine-tuning, our UniMASK models consistently outperform comparable single-task models.
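The correspondence between tasks and sequence maskings can be illustrated with a toy helper that builds boolean input masks over a trajectory of T (state, action) token pairs. The task names and mask conventions below are simplified assumptions for illustration, not the exact specification used by UniMASK; 1 marks a token given as input, 0 a token to be predicted.

```python
# Illustrative only: maps a task name to (state_mask, action_mask) over T steps.
def make_masks(task, T):
    state_mask = [0] * T
    action_mask = [0] * T
    if task == "behavior_cloning":
        state_mask = [1] * T                  # all states observed, actions predicted
    elif task == "inverse_dynamics":
        state_mask[0] = state_mask[-1] = 1    # endpoint states observed, actions inferred
    elif task == "forward_dynamics":
        state_mask[0] = 1                     # initial state plus all actions observed,
        action_mask = [1] * T                 # future states predicted
    else:
        raise ValueError(f"unknown task: {task}")
    return state_mask, action_mask

print(make_masks("inverse_dynamics", 4))
```

A single model trained on randomly sampled masks can then be queried with any of these patterns at inference time, which is the unification the abstract describes.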





A code

Neural Information Processing Systems

This section gives an overview of our open-source code. Together with this git repo, we include a 'tutorial colab', a Jupyter notebook that can be run in the browser without requiring any local installation. We view this open-source effort as a major contribution of our paper. We present the testbed pseudocode in this section. Recall the setup from Section 3.1; we now describe the other parameters we use in the Testbed, the benchmark agents from Section 3.3, and the choice of various settings. Step 3: compute likelihoods for n = 1, 2, . . .




HyperQ-Opt: Q-learning for Hyperparameter Optimization

Hasan, Md. Tarek

arXiv.org Artificial Intelligence

Hyperparameter optimization (HPO) is critical for enhancing the performance of machine learning models, yet it often involves a computationally intensive search across a large parameter space. Traditional approaches such as Grid Search and Random Search suffer from inefficiency and limited scalability, while surrogate models like Sequential Model-based Bayesian Optimization (SMBO) rely heavily on heuristic predictions that can lead to suboptimal results. This paper presents a novel perspective on HPO by formulating it as a sequential decision-making problem and leveraging Q-learning, a reinforcement learning technique, to optimize hyperparameters. The study explores the works of H.S. Jomaa et al. and Qi et al., which model HPO as a Markov Decision Process (MDP) and utilize Q-learning to iteratively refine hyperparameter settings. The approaches are evaluated for their ability to find optimal or near-optimal configurations within a limited number of trials, demonstrating the potential of reinforcement learning to outperform conventional methods. Additionally, this paper identifies research gaps in existing formulations, including the limitations of discrete search spaces and reliance on heuristic policies, and suggests avenues for future exploration. By shifting the paradigm toward policy-based optimization, this work contributes to advancing HPO methods for scalable and efficient machine learning applications.
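The MDP formulation of HPO can be sketched with a toy tabular Q-learner: states are indices into a discretized hyperparameter grid, actions move the index left/stay/right, and the reward is a validation score for the resulting configuration. The grid, scores, and episode structure below are hypothetical simplifications to illustrate the formulation, not the exact MDPs of Jomaa et al. or Qi et al.

```python
import random

GRID = [1.0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]    # candidate learning rates (illustrative)
SCORES = [-3.0, -2.0, -1.0, 0.0, -1.0, -2.0]  # hypothetical validation scores, peak at 1e-3
ACTIONS = (-1, 0, 1)                          # move left / stay / move right on the grid

def q_learn(episodes=300, horizon=10, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * len(ACTIONS) for _ in GRID]
    for _ in range(episodes):
        s = rng.randrange(len(GRID))          # start each episode at a random grid point
        for _ in range(horizon):
            if rng.random() < eps:            # epsilon-greedy exploration
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
            s2 = min(max(s + ACTIONS[a], 0), len(GRID) - 1)
            r = SCORES[s2]                    # reward = validation score of new config
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

def greedy_rollout(Q, s=0, steps=10):
    """Follow the learned policy from grid index s and return the final index."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
        s = min(max(s + ACTIONS[a], 0), len(GRID) - 1)
    return s

Q = q_learn()
print("chosen learning rate:", GRID[greedy_rollout(Q)])
```

The learned greedy policy walks the index toward the highest-scoring configuration, which is the sense in which a Q-learning agent can "iteratively refine hyperparameter settings" within a limited trial budget.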


Reviews: Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making

Neural Information Processing Systems

Summary: This paper reasons about a Pareto optimal social choice function in which the principals seek to agree on how to use a system that acts in a sequential decision-making problem where the principals may not share the same prior beliefs. Results suggest that to obtain such a function, the mechanism must over time make choices that favor the principal whose beliefs appear to be more correct. Quality: The work appears to be correct as far as I have been able to discern. However, I do not like the idea of not having the proof of the main theorem (Theorem 4) in the main paper, even if for the sake of brevity. My opinion is that if the theorem is that important, its proof should be next to it.